Invoking conditions utilized to obtain numerous "ideal" results
in nonlinear programming, this paper summarizes the development of a basis
for calculating the first partial derivatives of a Kuhn-Tucker triple and
the first and second partial derivatives of the optimal value function,
with respect to problem parameters. In the context of prior results, a
simpler but much more general derivation of the Kuhn-Tucker triple
derivatives is presented, and a more concise formula for the Hessian of the
optimal value function is given. Particularizations to the problems
with right hand side constraint perturbations, no constraint perturbations,
and no constraints follow easily and are briefly treated. Further extensions
and applications are indicated.
NONLINEAR PROGRAMMING SENSITIVITY ANALYSIS RESULTS
USING STRONG SECOND ORDER ASSUMPTIONS

by 

Anthony  V.  Fiacco 

1.  Introduction

Fiacco [15] recently obtained a theoretical basis for locally
characterizing the differentiability properties of a local solution and
the associated Lagrange multipliers for a large class of nonlinear
programming problems with respect to general parametric variations and
established the use of a penalty function method to estimate the parameter
partial derivatives. Independently, Robinson [22] obtained closely related
characterizations of the continuity properties of Kuhn-Tucker points,
including bounds on these quantities, and applied his results to derive
convergence rates for a family of nonlinear programming algorithms. These
sensitivity results are generalizations of a result presented in Fiacco
and McCormick [16, Theorem 6] for a particular class of parametric
perturbations.

Based on the results of [15], Armacost and Fiacco [4] obtained
general expressions for the first and second derivatives of the optimal
value function and gave concise expressions for the first derivatives of
a Kuhn-Tucker triple, along with approximations of these quantities by way
of penalty function calculations. Subsequently, Armacost and Fiacco [5]
particularized these results for right hand side constraint perturbations,
and Armacost [2] and Armacost and Fiacco [7] showed the applicability of
other (than the usual penalty function) methods in calculating sensitivity
information as a byproduct of normal algorithmic calculations. In particular,
the approach was shown to apply readily to exact penalty function and
augmented Lagrangian algorithms. Buys and Gonin [13] have also independently
shown the results for the latter.

The  approach  given  in  [15]  was  implemented  computationally  by  Armacost 
and  Mylander  [8].  Computational  experience  has  been  reported  by  Armacost 
and  Fiacco  [3],  [6],  and  Armacost  [2],  [1]. 

This paper gives a concise summary of the development of the
referenced results, drawing heavily on the material presented by Armacost and
Fiacco in [4]. However, the derivation of the expression for the derivatives
of a Kuhn-Tucker triple is appreciably simplified, while a much more general
formula for the derivatives is obtained, and a more concise form is given
for the optimal value Hessian. Also, for completeness, we particularize the
results to the problem with right hand side constraint perturbations, to the
problem with constraints not involving the parameters, and to the unconstrained
parametric problem.

In  Section  3 we  give  the  relevant  basic  results.  In  Section  4 these 
are  applied  to  develop  first  and  second  order  changes  in  the  optimal  value 
function  of  a general  class  of  parametric  nonlinear  programming  problems, 
with  respect  to  general  parametric  variations.  The  usual  Lagrange  multiplier 
sensitivity  result  and  the  indicated  result  given  in  Fiacco  and  McCormick  [16] 
are  obtained  as  particular  instances.  In  Section  5 we  derive  a general 
expression  and  a number  of  formulae,  depending  on  problem  structure  for 
computing  the  first  partial  derivatives  of  a Kuhn-Tucker  triple.  Section  6 
indicates  several  extensions  and  gives  a brief  discussion  of  applications. 
2.  Notation, Preliminaries, and the Problem

The following conventions will be used throughout the paper. $E^n$
denotes the usual n-dimensional Euclidean space. If $x \in E^n$, then $x$ is
an n x 1 (column) vector in $E^n$. The superscript T denotes transposition.
The symbols $\nabla$ and $\nabla^2$ denote the gradient and Hessian,
respectively, the subscript denoting the variables with respect to which
the derivatives are taken. The gradient of a scalar valued function is
assumed to be a row vector. Thus, if $f: E^n \times E^k \to E^1$ is once
differentiable in $x$, then
$\nabla_x f(x,\epsilon) = [\partial f(x,\epsilon)/\partial x_1, \ldots, \partial f(x,\epsilon)/\partial x_n]$,
a 1 x n vector. If $f$ is twice differentiable in $x$, then
$\nabla_x^2 f \equiv \nabla_x(\nabla_x f^T)$ denotes the n x n Hessian matrix of
$f(x,\epsilon)$ with respect to $x$, whose ij-th element is given by
$\partial^2 f(x,\epsilon)/\partial x_i \partial x_j$ for $i,j = 1,\ldots,n$.
Consistent with this, if $g: E^n \times E^k \to E^m$ is a vector function
whose components $g_i(x,\epsilon)$ are differentiable in $x$, then
$\nabla_x g(x,\epsilon)$ denotes the Jacobian of $g$ with respect to $x$, an
m x n matrix whose i-th row is given by $\nabla_x g_i(x,\epsilon)$,
$i = 1,\ldots,m$.

Differentiation with respect to $\epsilon$ is denoted similarly, of course.
Additionally, since there is ample occasion to differentiate such quantities
as $f[x(\epsilon),\epsilon]$ with respect to $\epsilon$, the notation
$\partial f/\partial\epsilon = (\partial f/\partial\epsilon_1, \ldots, \partial f/\partial\epsilon_k)$
is introduced to indicate and emphasize partial differentiation with respect
to the "independent" variable $\epsilon$ only as it appears explicitly. Thus,
application of the chain rule for differentiation yields
$\nabla_\epsilon f[x(\epsilon),\epsilon] = \nabla_x f \, \nabla_\epsilon x(\epsilon) + \partial f/\partial\epsilon$,
where $\nabla_x f$ and $\partial f/\partial\epsilon$ are evaluated at
$x(\epsilon)$. Analogous to the gradient, if $g[x(\epsilon),\epsilon]$ is an
m dimensional differentiable vector function on $E^n \times E^k$, then
$\partial g/\partial\epsilon$ is an m by k matrix whose i-th row is
$\partial g_i/\partial\epsilon$, evaluated at $x(\epsilon)$. Arguments of
functions are often omitted to simplify the notation, when it is felt that
no ambiguity will result.
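The distinction between the total derivative and the explicit partial derivative can be checked numerically. The sketch below is only an illustration: the function `f`, the path `x_of`, and the evaluation point are hypothetical choices, not taken from the paper, and `x_of` is not the solution of any optimization problem.

```python
import math

def f(x, eps):
    # Hypothetical scalar function f(x, eps), chosen only for illustration.
    return x * x + eps * x

def x_of(eps):
    # Hypothetical differentiable path x(eps).
    return math.sin(eps)

def total_derivative(eps):
    # Chain rule: grad_x f * grad_eps x + explicit partial of f w.r.t. eps.
    x = x_of(eps)
    grad_x_f = 2.0 * x + eps      # partial of f w.r.t. x at (x(eps), eps)
    grad_eps_x = math.cos(eps)    # derivative of the path x(eps)
    explicit_partial = x          # partial of f w.r.t. eps with x held fixed
    return grad_x_f * grad_eps_x + explicit_partial

def central_difference(eps, step=1e-6):
    # Numerical derivative of the composite map eps -> f[x(eps), eps].
    upper = f(x_of(eps + step), eps + step)
    lower = f(x_of(eps - step), eps - step)
    return (upper - lower) / (2.0 * step)

eps0 = 0.3
chain_rule_value = total_derivative(eps0)
numeric_value = central_difference(eps0)
explicit_only = x_of(eps0)  # what the explicit partial alone would give
```

The chain rule value agrees with the finite difference, while the explicit partial alone does not, which is the reason for keeping the two notations separate.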

Turning now to the problem of interest, consider the problem of
obtaining a local solution $x(\epsilon)$ of

minimize    $f(x,\epsilon)$

subject to  $g_i(x,\epsilon) \ge 0$ ,  $i = 1,\ldots,m$ ,          P($\epsilon$)
            $h_j(x,\epsilon) = 0$ ,  $j = 1,\ldots,p$ ,

where $\epsilon$ is a parameter in $E^k$ and $f$, $g_i$, $h_j$ are real
valued functions.

We  are  interested  in  studying  the  behavior  of  a local  solution  x(e) 
and  its  associated  (optimal)  Lagrange  multipliers  u(e)  , w(e)  for  small 
changes  in  the  parameter  vector  e , near  a specified  value  of  the  parameter. 
We  are  also  interested  in  the  "optimal  value"  f[x(e),e]  of  P(e)  . 

Without  loss  of  generality,  we  assume  the  specified  value  is  e = 0 . 

In this paper, we assume conditions strong enough to guarantee the
existence and differentiability of $x(\epsilon)$, $u(\epsilon)$, $w(\epsilon)$
near $\epsilon = 0$. Key conditions are the well known second order sufficient
conditions for a locally unique solution of Problem P(0). These may be found
in Fiacco and McCormick [16, Theorem 4] and in numerous current books and
papers. For completeness, we state them here, in the context of Problem P(0).


Define the Lagrangian of P($\epsilon$) as

$$L(x,u,w,\epsilon) = f(x,\epsilon) - \sum_{i=1}^{m} u_i g_i(x,\epsilon) + \sum_{j=1}^{p} w_j h_j(x,\epsilon) . \tag{2.1}$$

Second  order  conditions  are  intended  to  mean  conditions  based  on  the 

assumption  that  the  problem  functions  are  twice  continuously  differentiable. 

The second order sufficient conditions are said to hold for Problem P(0)
at a point $x^*$ if

(i)   $x^*$ is a feasible point of P(0),

(ii)  there exist $u^* = (u_1^*, \ldots, u_m^*)^T$ and
      $w^* = (w_1^*, \ldots, w_p^*)^T$ such that
      $\nabla_x L(x^*,u^*,w^*,0) = 0$, $u_i^* g_i(x^*) = 0$ and $u_i^* \ge 0$
      for $i = 1,\ldots,m$, and

(iii) $Z^T \nabla_x^2 L(x^*,u^*,w^*,0) Z > 0$ for every nonzero vector
      $Z \in E^n$ satisfying $Z^T \nabla g_i(x^*) \ge 0$ for all $i$ such that
      $g_i(x^*) = 0$, $Z^T \nabla g_i(x^*) = 0$ for all $i$ such that
      $g_i(x^*) = 0$ and $u_i^* > 0$, and $Z^T \nabla h_j(x^*) = 0$ for all
      $j = 1,\ldots,p$.

* 

If these conditions hold, then it follows that $x^*$ is an isolated (i.e.,
locally unique) local minimum of P(0), with associated (not necessarily unique)
"optimal" Lagrange multipliers $u^*$, $w^*$. Conditions (i) and (ii) require
first derivatives only and are known as the (first order) Kuhn-Tucker
conditions. If $(x^*,u^*,w^*)$ satisfies (i) - (iii), it will be called a
"Kuhn-Tucker triple."
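As a small illustration of conditions (i) - (iii), consider the following problem. It is a hypothetical example, not taken from the referenced papers, chosen only because its solution can be written down by inspection; it uses the Lagrangian sign convention of (2.1).

```latex
\begin{aligned}
&\text{minimize } f(x) = x_1^2 + x_2^2 \quad
 \text{subject to } g_1(x) = x_1 + x_2 - 1 \ge 0 . \\[4pt]
&\text{(i) } x^* = (1/2,\ 1/2)^T \text{ is feasible, with } g_1(x^*) = 0 . \\
&\text{(ii) } \nabla_x L = (2x_1 - u_1,\ 2x_2 - u_1) = 0 \text{ at } x^*
 \text{ gives } u_1^* = 1 \ge 0 , \quad u_1^* g_1(x^*) = 0 . \\
&\text{(iii) } \nabla_x^2 L = 2I , \text{ so }
 Z^T \nabla_x^2 L \, Z = 2\, Z^T Z > 0 \text{ for every nonzero } Z \in E^2 .
\end{aligned}
```

Here (iii) holds for all nonzero $Z$, hence in particular for those satisfying the stated gradient conditions, so $(x^*, u_1^*)$ is a Kuhn-Tucker triple (with no equality multipliers) and $x^*$ is an isolated local minimum.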

A few facts concerning these conditions should be noted, since they
have reference to interpreting the results that follow. If no constraints
are present, then if we suppress reference to the terms associated with the
constraints, the conditions reduce to

(i)   $x^* \in E^n$ ,

(ii)  $\nabla f(x^*) = 0$ , and

(iii) $Z^T \nabla^2 f(x^*) Z > 0$

for every nonzero vector $Z \in E^n$, and the conclusion becomes: $x^*$ is an
isolated local minimum of $f(x)$. Thus, the conditions reduce to the well
known second order sufficient conditions for an unconstrained local minimum
of $f(x)$. If the Problem P(0) is linear, then $\nabla_x^2 L = 0$. In this
case, the sufficient conditions as stated cannot possibly hold, unless there
are no nonzero vectors $Z$ satisfying the stipulated requirements, in which
instance there is no inconsistency in the given conditions. It can be shown
that the conclusion that $x^*$ is an isolated local minimum still follows under
these circumstances. It may be observed that the nonexistence of any vector
$Z$ as stated implies, in particular, that there is no nonzero vector
orthogonal to all the binding constraint gradients, hence, there must be
n binding constraints whose gradients are linearly independent. This
implies that $x^*$ is a vertex of the polyhedron defining the constraint
set, consistent with the conclusion that $x^*$ is an isolated local minimum.
Whether the problem is linear or not, if there is no nonzero vector $Z$
satisfying the given conditions, then the second order requirement (iii)
is satisfied in a logical sense, and in fact it can be shown that the
first order conditions (i) and (ii) are then sufficient to conclude that $x^*$
is an isolated local minimum. Finally, if P(0) is a convex programming
problem [16, Chapter 6] then, since a local solution of a convex problem is
global, the sufficient conditions imply that $x^*$ is the unique global
solution.

3.  First  Order  Changes  in  a Kuhn-Tucker  Triple 

The  following  result  provides  the  basis  for  the  development  that 
will  be  given  here. 

THEOREM 3.1 (Characterization of a Differentiable
Kuhn-Tucker Triple, Fiacco [15, Theorem 2.1])

If

(i)   the functions defining P($\epsilon$) are twice continuously
      differentiable in $(x,\epsilon)$ in a neighborhood of $(x^*,0)$;

(ii)  the second order sufficient conditions (Fiacco and
      McCormick [16, Theorem 4]) for a local minimum of P(0)
      hold at $x^*$ with associated Lagrange multipliers $u^*$
      and $w^*$;

(iii) the gradients $\nabla_x g_i(x^*,0)$ (for $i$ such that
      $g_i(x^*,0) = 0$) and $\nabla_x h_j(x^*,0)$ (all $j$) are linearly
      independent; and

(iv)  strict complementarity holds at $(x^*,0)$ with respect
      to $u^*$, i.e., $u_i^* > 0$ when $g_i(x^*,0) = 0$ ($i = 1,\ldots,m$);


T-377 


then

(a)   $x^*$ is a local isolated minimizing point of P(0) and
      the associated Lagrange multipliers $u^*$ and $w^*$ are
      unique;

(b)   for $\epsilon$ in a neighborhood of 0, there exists a unique
      once continuously differentiable vector function
      $y(\epsilon) = [x(\epsilon)^T, u(\epsilon)^T, w(\epsilon)^T]^T$ (where T
      denotes transposition) satisfying the second order sufficient
      conditions for a local minimum of Problem P($\epsilon$) such that
      $y(0) = (x^*,u^*,w^*)$ and, hence, $x(\epsilon)$ is a locally
      unique minimum of P($\epsilon$) with associated Lagrange
      multipliers $u(\epsilon)$ and $w(\epsilon)$; and

(c)   strict complementarity and linear independence of the
      binding constraint gradients hold at $x(\epsilon)$ for $\epsilon$
      near 0.

The conditions of the theorem will be assumed throughout the
remainder of the paper.

When $y(\epsilon)$ is available,
$\nabla_\epsilon y(\epsilon) = (\nabla_\epsilon x(\epsilon)^T, \nabla_\epsilon u(\epsilon)^T, \nabla_\epsilon w(\epsilon)^T)^T$
(an (n+m+p) by k matrix) can be calculated by noting that the theorem
implies the satisfaction of the Kuhn-Tucker conditions for P($\epsilon$) at
$y(\epsilon)$ near $\epsilon = 0$, i.e.,

$$\begin{aligned}
\nabla_x L[x(\epsilon), u(\epsilon), w(\epsilon), \epsilon] &= 0 , \\
u_i(\epsilon)\, g_i[x(\epsilon), \epsilon] &= 0 , \quad i = 1,\ldots,m , \\
h_j[x(\epsilon), \epsilon] &= 0 , \quad j = 1,\ldots,p .
\end{aligned} \tag{3.1}$$

Since near $\epsilon = 0$ the Jacobian, $M(\epsilon)$, of this system with
respect to $(x,u,w)$ is nonsingular under the given assumptions, the total
derivative of the system with respect to $\epsilon$ is well defined and must
equal zero. This yields

$$M(\epsilon)\, \nabla_\epsilon y(\epsilon) = N(\epsilon) \tag{3.2}$$

where $N(\epsilon)$ is the negative of the Jacobian of the Kuhn-Tucker system
with respect to $\epsilon$, and hence

$$\nabla_\epsilon y(\epsilon) = M(\epsilon)^{-1} N(\epsilon) . \tag{3.3}$$
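A minimal sketch of the calculation in (3.2) may help. The problem used here is a hypothetical illustration, not one from the paper: minimize $(x_1^2 + x_2^2)/2$ subject to $g(x,\epsilon) = x_1 + x_2 - 1 - \epsilon \ge 0$, whose Kuhn-Tucker pair near $\epsilon = 0$ is $x_1 = x_2 = u = (1+\epsilon)/2$. The helper `solve_linear` and all other names are assumptions introduced for the example.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a z = b by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def kuhn_tucker_sensitivity(eps):
    # Closed-form Kuhn-Tucker pair of the illustrative problem.
    x1 = x2 = (1 + eps) / 2
    u = (1 + eps) / 2
    g = x1 + x2 - 1 - eps  # equals zero: the constraint is binding
    # Jacobian M of system (3.1) with respect to (x1, x2, u):
    # rows are grad_x L = (x1 - u, x2 - u) and the product u * g.
    M = [[1, 0, -1],
         [0, 1, -1],
         [u, u, g]]
    # N is the negative Jacobian with respect to eps; only u * g depends
    # on eps explicitly, with d(u * g)/d eps = -u.
    N = [0, 0, u]
    return solve_linear(M, N)

# Derivatives of (x1, x2, u) with respect to eps at eps = 0.
sensitivity = kuhn_tucker_sensitivity(Fraction(0))
```

The computed derivatives all equal 1/2, in agreement with differentiating the closed-form solution directly. The linear system is solved rather than forming the inverse in (3.3) explicitly.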

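If there are no constraints present in P(0) then Theorem 3.1
reduces to the statement that if $f(x,\epsilon)$ is twice continuously
differentiable in $(x,\epsilon)$ near $(x^*,0)$, and if
$\nabla_x f(x^*,0) = 0$ and $\nabla_x^2 f(x^*,0)$ is positive definite, then
$x^*$ is a local isolated unconstrained minimum of $f(x,0)$, and there exists
near $\epsilon = 0$ a unique once continuously differentiable function
$x(\epsilon)$ satisfying $\nabla_x f[x(\epsilon),\epsilon] = 0$, with
$\nabla_x^2 f[x(\epsilon),\epsilon]$ positive definite and such that
$x(0) = x^*$. Equation (3.2) becomes

$$\nabla_x^2 f[x(\epsilon),\epsilon]\, \nabla_\epsilon x(\epsilon) + \frac{\partial}{\partial\epsilon} \nabla_x f^T[x(\epsilon),\epsilon] = 0 \tag{3.4}$$

and hence (3.3) becomes

$$\nabla_\epsilon x(\epsilon) = -\nabla_x^2 f[x(\epsilon),\epsilon]^{-1}\, \frac{\partial}{\partial\epsilon} \nabla_x f^T[x(\epsilon),\epsilon] . \tag{3.5}$$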
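A one-variable worked instance of (3.4) and (3.5) is given below. The function is a hypothetical choice made only so that $x(\epsilon)$ is available in closed form for comparison.

```latex
\begin{aligned}
f(x,\epsilon) &= x^2 + \epsilon x , &
\nabla_x f &= 2x + \epsilon , &
\nabla_x^2 f &= 2 , &
\frac{\partial}{\partial\epsilon}\nabla_x f^T &= 1 , \\
x(\epsilon) &= -\tfrac{1}{2}\epsilon
 &&\text{(from } \nabla_x f = 0\text{)} , \\
\nabla_\epsilon x(\epsilon) &= -(2)^{-1}(1) = -\tfrac{1}{2}
 &&\text{(from (3.5))} .
\end{aligned}
```

The value obtained from (3.5) matches the derivative of the closed-form minimizer $x(\epsilon) = -\epsilon/2$.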

These  calculations  will  be  pursued  in  some  detail  in  Section  5.  First, 
however,  we  give  several  results  characterizing  the  optimal  value  function 
f[x(e),e]  that  follow  immediately  from  Theorem  3.1. 

4.  First  and  Second  Order  Changes  in 
the  Optimal  Value  Function 

Because of important connections with Lagrange multipliers and duality
theory, first order changes in the optimal value function have traditionally
been analyzed with respect to variations in the "right hand side" of the
constraints. An extension to perturbations (of all problem functions) that are
linear in the problem parameters was obtained by Fiacco and McCormick [16].
Buys [12] also derives second order changes, in connection with an analysis
of the behavior of the optimal value of an associated augmented Lagrangian
function. Here, under the assumptions of Theorem 3.1, it is shown how first
and second order results follow immediately for the general class of
parametric variations being considered. The referenced results are obtained as
special cases.
Let $y(\epsilon) = [x(\epsilon)^T, u(\epsilon)^T, w(\epsilon)^T]^T$ be a
Kuhn-Tucker triple, where $x(\epsilon)$ solves Problem P($\epsilon$) for
$\epsilon$ near 0. The "optimal value function" is defined as

$$f^*(\epsilon) = f[x(\epsilon),\epsilon] , \tag{4.1}$$

and the "optimal value Lagrangian" as

$$L^*(\epsilon) = L[x(\epsilon),u(\epsilon),w(\epsilon),\epsilon] . \tag{4.2}$$

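THEOREM 4.1 (First and Second Order Changes in the Optimal
Value Function of Problem P($\epsilon$), Armacost and
Fiacco [4, Theorem 3])

If the conditions of Theorem 3.1 hold for Problem P($\epsilon$), then, in a
neighborhood of $\epsilon = 0$,

(a)
$$f^*(\epsilon) = L^*(\epsilon) , \tag{4.3}$$

(b)
$$\begin{aligned}
\nabla_\epsilon f^*(\epsilon) = \frac{\partial L}{\partial\epsilon}
 &= \frac{\partial f}{\partial\epsilon} - \sum_{i=1}^{m} u_i(\epsilon) \left[\frac{\partial g_i}{\partial\epsilon}\right] + \sum_{j=1}^{p} w_j(\epsilon) \left[\frac{\partial h_j}{\partial\epsilon}\right] \\
 &= \frac{\partial f}{\partial\epsilon} - u(\epsilon)^T \frac{\partial g}{\partial\epsilon} + w(\epsilon)^T \frac{\partial h}{\partial\epsilon} ,
\end{aligned} \tag{4.4}$$

and hence also

(c)
$$\begin{aligned}
\nabla_\epsilon^2 f^*(\epsilon) &= \nabla_\epsilon \left[ \left(\frac{\partial L}{\partial\epsilon}\right)^T \right] \\
 &= \nabla_x \left[ \left(\frac{\partial L}{\partial\epsilon}\right)^T \right] \nabla_\epsilon x(\epsilon)
 - \sum_{i=1}^{m} \left[\frac{\partial g_i}{\partial\epsilon}\right]^T \nabla_\epsilon u_i(\epsilon)
 + \sum_{j=1}^{p} \left[\frac{\partial h_j}{\partial\epsilon}\right]^T \nabla_\epsilon w_j(\epsilon)
 + \frac{\partial^2 L}{\partial\epsilon^2} .
\end{aligned} \tag{4.5}$$

Proof: Recall that in a neighborhood of $\epsilon = 0$,
$u_i(\epsilon)\, g_i[x(\epsilon),\epsilon] = 0$, $i = 1,\ldots,m$, strict
complementary slackness holds, $h_j[x(\epsilon),\epsilon] = 0$,
$j = 1,\ldots,p$, and
$y(\epsilon) = [x(\epsilon)^T,u(\epsilon)^T,w(\epsilon)^T]^T \in C^1$. It
follows immediately that

$$f[x(\epsilon),\epsilon] = L[x(\epsilon),u(\epsilon),w(\epsilon),\epsilon] , \tag{4.6}$$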
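yielding (a). Furthermore, we can differentiate (4.6) to obtain

$$\begin{aligned}
\nabla_\epsilon f^*(\epsilon) = \nabla_\epsilon f[x(\epsilon),\epsilon]
 &= \nabla_\epsilon L^*(\epsilon) = \nabla_\epsilon L[x(\epsilon),u(\epsilon),w(\epsilon),\epsilon] \\
 &= \nabla_x L \, \nabla_\epsilon x(\epsilon) + \nabla_u L \, \nabla_\epsilon u(\epsilon) + \nabla_w L \, \nabla_\epsilon w(\epsilon) + \frac{\partial L}{\partial\epsilon} ,
\end{aligned} \tag{4.7}$$

where $L$ is evaluated as in (4.6).

Since the Kuhn-Tucker conditions hold at $y(\epsilon)$, it follows that
$\nabla_x L = 0$. Complementary slackness implies $u_i(0) = 0$ or
$g_i[x(0),0] = 0$, $i = 1,\ldots,m$. Strict complementary slackness,
continuity and differentiability then imply one of two consequences,
respectively:

(i)   $g_i[x(0),0] > 0$, implying $g_i[x(\epsilon),\epsilon] > 0$ for
      $\epsilon$ near 0, implying $u_i(\epsilon) = 0$, implying
      $\nabla_\epsilon u_i(\epsilon) = 0$; or

(ii)  $u_i(0) > 0$, implying $u_i(\epsilon) > 0$ for $\epsilon$ near 0,
      implying $g_i[x(\epsilon),\epsilon] = 0$.

From this, it follows that
$\nabla_u L \, \nabla_\epsilon u(\epsilon) = (-g_1[x(\epsilon),\epsilon], \ldots, -g_m[x(\epsilon),\epsilon]) \, \nabla_\epsilon u(\epsilon) = 0$.
Also, since $h_j[x(\epsilon),\epsilon] = 0$ for $\epsilon$ near 0,
$\nabla_w L \, \nabla_\epsilon w(\epsilon) = (h_1[x(\epsilon),\epsilon], \ldots, h_p[x(\epsilon),\epsilon]) \, \nabla_\epsilon w(\epsilon) = 0$.
We therefore conclude from (4.7) that
$\nabla_\epsilon f^*(\epsilon) = \partial L/\partial\epsilon$ for $\epsilon$
near 0, proving (b).

Differentiation with respect to $\epsilon$ of the result obtained in (b)
gives

$$\nabla_\epsilon^2 f^*(\epsilon) = \nabla_x \left[\frac{\partial L}{\partial\epsilon}^T\right] \nabla_\epsilon x(\epsilon) + \nabla_u \left[\frac{\partial L}{\partial\epsilon}^T\right] \nabla_\epsilon u(\epsilon) + \nabla_w \left[\frac{\partial L}{\partial\epsilon}^T\right] \nabla_\epsilon w(\epsilon) + \frac{\partial^2 L}{\partial\epsilon^2} .$$

Calculation of the derivatives yields (c).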

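To be perfectly clear about what is involved in calculating this
Hessian, we write Equation (4.5) in terms of the original problem functions.
We have

$$\begin{aligned}
\nabla_\epsilon^2 f^*(\epsilon) ={}& \left\{ \nabla_x \left(\frac{\partial f}{\partial\epsilon}\right)^T - \sum_{i=1}^{m} u_i(\epsilon) \nabla_x \left(\frac{\partial g_i}{\partial\epsilon}\right)^T + \sum_{j=1}^{p} w_j(\epsilon) \nabla_x \left(\frac{\partial h_j}{\partial\epsilon}\right)^T \right\} \nabla_\epsilon x(\epsilon) \\
 &- \sum_{i=1}^{m} \left(\frac{\partial g_i}{\partial\epsilon}\right)^T \nabla_\epsilon u_i(\epsilon) + \sum_{j=1}^{p} \left(\frac{\partial h_j}{\partial\epsilon}\right)^T \nabla_\epsilon w_j(\epsilon) \\
 &+ \left\{ \frac{\partial^2 f}{\partial\epsilon^2} - \sum_{i=1}^{m} u_i(\epsilon) \frac{\partial^2 g_i}{\partial\epsilon^2} + \sum_{j=1}^{p} w_j(\epsilon) \frac{\partial^2 h_j}{\partial\epsilon^2} \right\} .
\end{aligned} \tag{4.8}$$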
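The gradient and Hessian formulas of Theorem 4.1 can be checked on a one-variable problem. The problem below is a hypothetical illustration (not from the paper): minimize $x^2 + \epsilon x$ subject to $g(x,\epsilon) = x - 1 - \epsilon \ge 0$. Near $\epsilon = 0$ the constraint is binding, so $x(\epsilon) = 1 + \epsilon$ and $u(\epsilon) = 2 + 3\epsilon$; all function names are assumptions made for the sketch.

```python
def kt_pair(eps):
    # Closed-form Kuhn-Tucker pair of the illustrative problem near eps = 0.
    x = 1.0 + eps        # binding constraint: x - 1 - eps = 0
    u = 2.0 * x + eps    # stationarity of L = f - u * g: 2x + eps - u = 0
    return x, u

def optimal_value(eps):
    x, _ = kt_pair(eps)
    return x * x + eps * x

def gradient_by_4_4(eps):
    # (4.4): df/d eps - u * dg/d eps, with df/d eps = x and dg/d eps = -1.
    x, u = kt_pair(eps)
    return x - u * (-1.0)

def hessian_by_4_5(eps):
    # (4.5) with one inequality and no equalities.
    grad_x_of_dL_deps = 1.0   # d/dx of dL/d eps = x + u, with u held fixed
    d_x = 1.0                 # derivative of x(eps)
    dg_deps = -1.0            # explicit partial of g w.r.t. eps
    d_u = 3.0                 # derivative of u(eps)
    d2L_deps2 = 0.0           # L is linear in eps when x and u are held fixed
    return grad_x_of_dL_deps * d_x - dg_deps * d_u + d2L_deps2

def first_difference(func, eps, step=1e-5):
    return (func(eps + step) - func(eps - step)) / (2.0 * step)

def second_difference(func, eps, step=1e-3):
    return (func(eps + step) - 2.0 * func(eps) + func(eps - step)) / step ** 2

eps0 = 0.1
numeric_gradient = first_difference(optimal_value, eps0)
numeric_hessian = second_difference(optimal_value, eps0)
```

Note that the Hessian requires the derivatives of $x(\epsilon)$ and $u(\epsilon)$, whereas the gradient needs only the pair itself.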


Equation (4.4) reduces to previously established results when certain
problem structures are considered. The first corollary gives the well-known
"Lagrange multiplier sensitivity result," and also establishes relations
for the Hessian of the optimal value function taken with respect to the
right hand side of the constraints. For this case, Problem P($\epsilon$)
reduces to

minimize    $f(x)$

subject to  $g_i(x) \ge \epsilon_i$ ,  $i = 1,\ldots,m$ ,          R($\epsilon$)
            $h_j(x) = \epsilon_{j+m}$ ,  $j = 1,\ldots,p$ .

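COROLLARY 4.1 (First and Second Order Changes in the Optimal
Value Function for Right Hand Side Perturbations,
Armacost and Fiacco [4, Corollary 3.1])

If

(i)   the functions defining R($\epsilon$) in a neighborhood of
      $\epsilon = 0$ are twice continuously differentiable in $x$,
      in a neighborhood of $x^*$, and

(ii)  conditions (ii) - (iv) of Theorem 3.1 hold,

then, in a neighborhood of $\epsilon = 0$,

(a)
$$\nabla_\epsilon f^*(\epsilon)^T = \begin{bmatrix} u(\epsilon) \\ -w(\epsilon) \end{bmatrix} , \quad \text{and}$$

(b)
$$\nabla_\epsilon^2 f^*(\epsilon) = \begin{bmatrix} \nabla_\epsilon u(\epsilon) \\ -\nabla_\epsilon w(\epsilon) \end{bmatrix} .$$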
Proof: We let $f(x,\epsilon) = f(x)$,
$g_i(x,\epsilon) = g_i(x) - \epsilon_i$, $i = 1,\ldots,m$,
$h_j(x,\epsilon) = h_j(x) - \epsilon_{j+m}$, $j = 1,\ldots,p$,
and apply the results of Theorem 4.1.
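A single-constraint instance of R($\epsilon$) shows both conclusions at a glance. The problem is a hypothetical illustration, chosen only because everything can be written in closed form.

```latex
\begin{aligned}
&\text{minimize } x^2 \quad \text{subject to } g_1(x) = x - 1 \ge \epsilon_1 . \\[4pt]
&x(\epsilon) = 1 + \epsilon_1 , \qquad
 u_1(\epsilon) = 2\,(1 + \epsilon_1)
 \quad\text{(from } \nabla_x L = 2x - u_1 = 0\text{)} , \\
&f^*(\epsilon) = (1 + \epsilon_1)^2 , \qquad
 \nabla_\epsilon f^*(\epsilon) = 2\,(1 + \epsilon_1) = u_1(\epsilon) , \qquad
 \nabla_\epsilon^2 f^*(\epsilon) = 2 = \nabla_\epsilon u_1(\epsilon) .
\end{aligned}
```

The first derivative of the optimal value equals the multiplier, and the second derivative equals the derivative of the multiplier, as stated in (a) and (b) for the inequality block.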


A second corollary is a generalization of the result established by
Fiacco and McCormick [16, Theorem 6] for the problem

minimize    $f(x) + \epsilon_0 a_0(x)$

subject to  $g_i(x) + \epsilon_i b_i(x) \ge 0$ ,  $i = 1,\ldots,m$ ,          R($\epsilon$)
            $h_j(x) + \epsilon_{j+m} c_j(x) = 0$ ,  $j = 1,\ldots,p$ .


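COROLLARY 4.2 (First Order Changes, Fiacco and McCormick
[16, Theorem 6], Armacost and Fiacco [4, Corollary
3.2], and Second Order Changes in the Optimal
Value Function for Perturbations Linear in the
Parameters)

If

(i)   the functions of Problem R($\epsilon$) in a neighborhood of
      $\epsilon = 0$ are twice continuously differentiable in $x$,
      in a neighborhood of $x^*$, and

(ii)  conditions (ii) - (iv) of Theorem 3.1 hold,

then, in a neighborhood of $\epsilon = 0$,

(a)
$$\nabla_\epsilon f^*(\epsilon)^T =
\begin{bmatrix}
a_0[x(\epsilon)] \\
-u_1(\epsilon) b_1[x(\epsilon)] \\
\vdots \\
-u_m(\epsilon) b_m[x(\epsilon)] \\
w_1(\epsilon) c_1[x(\epsilon)] \\
\vdots \\
w_p(\epsilon) c_p[x(\epsilon)]
\end{bmatrix}
=
\begin{bmatrix}
a_0[x(\epsilon)] \\
-B[x(\epsilon)]\, u(\epsilon) \\
C[x(\epsilon)]\, w(\epsilon)
\end{bmatrix} , \quad \text{and}$$

(b)
$$\nabla_\epsilon^2 f^*(\epsilon) =
\begin{bmatrix}
\nabla_x a_0[x(\epsilon)]\, \nabla_\epsilon x(\epsilon) \\
-U(\epsilon)\, \nabla_x b[x(\epsilon)]\, \nabla_\epsilon x(\epsilon) - B[x(\epsilon)]\, \nabla_\epsilon u(\epsilon) \\
W(\epsilon)\, \nabla_x c[x(\epsilon)]\, \nabla_\epsilon x(\epsilon) + C[x(\epsilon)]\, \nabla_\epsilon w(\epsilon)
\end{bmatrix} ,$$

where

$U = \mathrm{diag}(u_i)$ , $B = \mathrm{diag}(b_i)$ , $i = 1,\ldots,m$ ,

$W = \mathrm{diag}(w_j)$ , $C = \mathrm{diag}(c_j)$ , $j = 1,\ldots,p$ ,

$b = (b_1,\ldots,b_m)^T$ and $c = (c_1,\ldots,c_p)^T$ .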


Proof: We let $f(x,\epsilon) = f(x) + \epsilon_0 a_0(x)$,
$g_i(x,\epsilon) = g_i(x) + \epsilon_i b_i(x)$, $i = 1,\ldots,m$,
$h_j(x,\epsilon) = h_j(x) + \epsilon_{j+m} c_j(x)$, $j = 1,\ldots,p$, and
apply the results of Theorem 4.1, having verified as in Corollary 4.1 that
the conditions of Theorem 3.1 are satisfied. In particular, with * denoting
evaluation at $\epsilon = 0$, we have that

$$\nabla_\epsilon f^*(0) = (a_0^*, -u_1^* b_1^*, \ldots, -u_m^* b_m^*, w_1^* c_1^*, \ldots, w_p^* c_p^*) ,$$

the result obtained in [16, Theorem 6]. Conclusion (b) follows from
differentiation of (a).


A third  corollary  summarizes  the  well  known  results  that  follow  in 
the  absence  of  constraints.  Note  that  under  the  given  conditions,  the 
corollary  also  applies  if  constraints  are  present  in  P(0)  , but  are  not 
binding  at  x*  . 

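COROLLARY 4.3 (First and Second Order Changes in the
Unconstrained Optimal Value Function)

If $f(x,\epsilon)$ is twice continuously differentiable in $(x,\epsilon)$ near
$(x^*,0)$ and if $\nabla_x f(x^*,0) = 0$ and $\nabla_x^2 f(x^*,0)$ is positive
definite, then, in a neighborhood of $\epsilon = 0$,

(a)
$$\nabla_\epsilon f^*(\epsilon) = \frac{\partial f}{\partial\epsilon} , \quad \text{and}$$

(b)
$$\nabla_\epsilon^2 f^*(\epsilon) = \left[\frac{\partial}{\partial\epsilon}(\nabla_x f^T)\right]^T \nabla_\epsilon x(\epsilon) + \frac{\partial^2 f}{\partial\epsilon^2} ,$$

or equivalently,

(b')
$$\nabla_\epsilon^2 f^*(\epsilon) = -\left[\frac{\partial}{\partial\epsilon}(\nabla_x f^T)\right]^T \left[\nabla_x^2 f\right]^{-1} \left[\frac{\partial}{\partial\epsilon}(\nabla_x f^T)\right] + \frac{\partial^2 f}{\partial\epsilon^2} .$$

Proof: Suppress reference to all terms involving the constraints
in Theorem 3.1 to conclude the existence of $x(\epsilon)$ satisfying the
conditions stated in the conclusions of the theorem and further interpreted
in the paragraph just preceding Equation (3.4). Differentiate
$f^*(\epsilon) = f[x(\epsilon),\epsilon]$ with respect to $\epsilon$ and use
$\nabla_x f[x(\epsilon),\epsilon] = 0$ to obtain (a). Differentiate the
expression for $\nabla_\epsilon f[x(\epsilon),\epsilon]$ obtained in (a) to
obtain (b), and use the fact that $\nabla_x^2 f[x(\epsilon),\epsilon]$ is
positive definite and (3.5) to obtain (b').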
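Conclusions (a) and (b') can be checked against closed forms on a scalar example. The function below is a hypothetical choice, not from the paper: $f(x,\epsilon) = x^2 - 2x \sin\epsilon$, whose minimizer is $x(\epsilon) = \sin\epsilon$, so that $f^*(\epsilon) = -\sin^2\epsilon$. The function names are assumptions introduced for this sketch.

```python
import math

def minimizer(eps):
    # grad_x f = 2x - 2 sin(eps) = 0 gives x(eps) = sin(eps).
    return math.sin(eps)

def gradient_by_a(eps):
    # (a): explicit partial of f w.r.t. eps, evaluated at x(eps).
    x = minimizer(eps)
    return -2.0 * x * math.cos(eps)

def hessian_by_b_prime(eps):
    # (b'): -[d/d eps grad_x f]^T [grad_x^2 f]^{-1} [d/d eps grad_x f] + d^2 f/d eps^2.
    x = minimizer(eps)
    mixed = -2.0 * math.cos(eps)       # d/d eps of grad_x f = 2x - 2 sin(eps)
    hess_x = 2.0                       # grad_x^2 f, positive definite
    explicit = 2.0 * x * math.sin(eps) # d^2 f / d eps^2 with x held fixed
    return -mixed * (1.0 / hess_x) * mixed + explicit

eps0 = 0.4
# Closed forms from f*(eps) = -sin(eps)^2.
exact_gradient = -math.sin(2.0 * eps0)
exact_hessian = -2.0 * math.cos(2.0 * eps0)
```

In this example the Hessian of the optimal value is negative near $\epsilon = 0$ even though $f$ is strictly convex in $x$, which shows that the sign of $\nabla_\epsilon^2 f^*$ is not determined by $\nabla_x^2 f$ alone.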

It is interesting to note that if the constraints of P($\epsilon$) are
independent of $\epsilon$, then application of the results of Theorem 4.1
yields the same expressions for $\nabla_\epsilon f^*(\epsilon)$ and
$\nabla_\epsilon^2 f^*(\epsilon)$ as those obtained above in (a), (b) and (b'),
respectively, for the unconstrained problem.

Note  from  Theorem  4.1  that  the  values  of  the  optimal  value  function 
and  its  gradient  can  be  calculated  once  the  Kuhn-Tucker  triple  y(e)  has 
been  determined.  However,  in  general,  the  value  of  the  Hessian  matrix  of 
the  optimal  value  function  requires  the  determination  of  both  the  triple 
and  its  first  derivatives. 

We  next  examine  various  aspects  of  calculating  these  derivatives. 

Aside from their use in calculating the optimal value function Hessian,
they are of considerable importance in other applications, e.g., in
characterizing the stability of the solution subject to perturbation and in
providing a first order estimate of Kuhn-Tucker triples of problems involving
different values of the parameters, once one such triple has been determined.
The analysis leads to a study of the Jacobian $M(\epsilon)$ of the Kuhn-Tucker
system (3.1).
5.  Computation of First Order Changes
in the Kuhn-Tucker Triple

Our task is to calculate $\nabla_\epsilon y(\epsilon)$ for Problem
P($\epsilon$) when the conditions of Theorem 3.1 hold. As noted, once
$y(\epsilon)$ is available, $\nabla_\epsilon y(\epsilon)$ can be calculated by
using (3.2) or (3.3). Appreciable efficiencies in computation can be
introduced by analyzing the various possibilities.

Conclusion (c) of Theorem 3.1 implies that, near $\epsilon = 0$,
$u_i(\epsilon) > 0$ if $u_i(0) > 0$ and, using Assumption (i), we can also
conclude that $g_i[x(\epsilon),\epsilon] > 0$ for all $i$ such that
$g_i[x(0),0] > 0$, which in turn implies that $u_i(\epsilon) = 0$ whenever
$g_i[x(0),0] > 0$. Using these facts essentially allows us to eliminate those
terms associated with constraints that are not binding at $x^*$, and also
allows us to divide out the positive $u_i(\epsilon)$ from the corresponding
complementary slackness equations.

This leads to a considerable simplification of the Kuhn-Tucker
conditions (3.1), which must hold near $\epsilon = 0$. Without loss of
generality, assume that the first r inequality constraints are binding. We are
thus led to studying the system

$$\begin{aligned}
\nabla_x L[x(\epsilon),\bar u(\epsilon),w(\epsilon),\epsilon] &= 0 , \\
-\bar g[x(\epsilon),\epsilon] &= 0 , \\
h[x(\epsilon),\epsilon] &= 0 ,
\end{aligned} \tag{5.1}$$

where $\bar g = (g_1,\ldots,g_r)^T$ and $h = (h_1,\ldots,h_p)^T$. (The minus
sign before $\bar g$ leads to notational simplifications.)

It is assumed in the following development that the analysis is
confined to a neighborhood of $\epsilon = 0$ where the conclusions of
Theorem 3.1 are valid.

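Differentiating (5.1) with respect to $\epsilon$ according to the chain
rule yields

$$M(\epsilon)\, \nabla_\epsilon y(\epsilon) = N(\epsilon) \tag{5.2}$$

where the Jacobians $M$ and $-N$ of (5.1) with respect to $(x,\bar u,w)$ and
$\epsilon$ are, respectively,

$$M = \begin{bmatrix} \nabla_x^2 L & P^T \\ P & 0 \end{bmatrix} \tag{5.3}$$

and

$$-N = \left[ \left(\frac{\partial(\nabla_x L^T)}{\partial\epsilon}\right)^T , \; -\left(\frac{\partial \bar g}{\partial\epsilon}\right)^T , \; \left(\frac{\partial h}{\partial\epsilon}\right)^T \right]^T , \tag{5.4}$$

where

$$P = (-\nabla_x \bar g^T , \; \nabla_x h^T)^T , \qquad
y(\epsilon) = (x(\epsilon)^T, \bar u(\epsilon)^T, w(\epsilon)^T)^T , \tag{5.5}$$

and

$$\bar u(\epsilon) = (u_1(\epsilon), \ldots, u_r(\epsilon))^T .$$

Under the given conditions, it is known that the Jacobian $M(\epsilon)$ of
(3.1) with respect to $(x,u,w)$ is nonsingular, and hence it follows that $M$
defined in (5.3) is nonsingular, for $\epsilon$ near 0. Thus,

$$\nabla_\epsilon y(\epsilon) = M(\epsilon)^{-1} N(\epsilon) . \tag{5.6}$$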

Clearly, any method for solving the linear system of Equation (5.2)
is applicable for calculating $\nabla_\epsilon y(\epsilon)$, and $M$ need not
be inverted as in (5.6). However, under the given assumptions, the work
involved in calculating $M(\epsilon)^{-1}$ can be significantly reduced, as
will presently become evident.

Consider the various possibilities: (1) $(\nabla_x^2 L)^{-1}$ exists;
(2) $\nabla_x^2 L = 0$; (3) $r + p = n$ (i.e., there are n binding
constraints); or (4) $r + p < n$. Brief reflection will indicate that all
possibilities are covered by these conditions (in fact, under the given
assumptions, either (3) or (4) must be true, but (1) and (2) are also
specified since they introduce further simplifications).
Letting

$$M(\epsilon)^{-1} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} , \tag{5.7}$$

formulas for the block components $A_{ij}$ are given below for each
case delineated, requiring at most the inversion of an n by n matrix.
The first three cases follow readily from a straightforward manipulation of
(5.3).


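Case 1.  $(\nabla_x^2 L)^{-1}$ exists.

Since the assumptions of Theorem 3.1 guarantee that
$[P (\nabla_x^2 L)^{-1} P^T]^{-1}$ exists, it is easily shown that

$$\begin{aligned}
A_{11} &= (\nabla_x^2 L)^{-1} \left[ I - P^T \left( P (\nabla_x^2 L)^{-1} P^T \right)^{-1} P (\nabla_x^2 L)^{-1} \right] , \\
A_{12} &= A_{21}^T = (\nabla_x^2 L)^{-1} P^T \left[ P (\nabla_x^2 L)^{-1} P^T \right]^{-1} , \\
A_{22} &= -\left[ P (\nabla_x^2 L)^{-1} P^T \right]^{-1} .
\end{aligned} \tag{5.8}$$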
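The block formulas (5.8) can be verified by direct multiplication on small data. In the sketch below the 2 by 2 matrix `H` stands in for $\nabla_x^2 L$ and the single row `P` for one binding constraint gradient; both are hypothetical values chosen for illustration, and `H` is taken diagonal so that its inverse is immediate. The check assumes $M$ has the block form $[\nabla_x^2 L, P^T; P, 0]$ implied by (5.1).

```python
from fractions import Fraction as F

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical data: n = 2 variables, r + p = 1 binding constraint.
H = [[F(2), F(0)], [F(0), F(4)]]   # stands in for the Hessian of the Lagrangian
P = [[F(1), F(1)]]                 # stands in for the constraint Jacobian P

H_inv = [[1 / H[0][0], F(0)], [F(0), 1 / H[1][1]]]  # valid because H is diagonal
PT = transpose(P)
schur = matmul(matmul(P, H_inv), PT)[0][0]          # scalar P H^-1 P^T

# Block components from (5.8).
A12 = [[row[0] / schur] for row in matmul(H_inv, PT)]
A21 = transpose(A12)
A22 = [[-1 / schur]]
correction = matmul(A12, matmul(P, H_inv))
A11 = [[H_inv[i][j] - correction[i][j] for j in range(2)] for i in range(2)]

# Assemble M and the claimed inverse, then multiply.
M = [H[0] + PT[0], H[1] + PT[1], P[0] + [F(0)]]
M_inv = [A11[0] + A12[0], A11[1] + A12[1], A21[0] + A22[0]]
product = matmul(M, M_inv)
identity = [[F(int(i == j)) for j in range(3)] for i in range(3)]
```

Exact rational arithmetic is used so that the product can be compared with the identity without a tolerance. Only the n by n matrix `H` and the (r+p) by (r+p) matrix `schur` are inverted.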

Case 2.  $\nabla_x^2 L = 0$.

There are two possible situations: there are $r + p < n$, or
$r + p = n$ linearly independent binding constraint gradients. If there are
fewer than n, Assumption (ii) of Theorem 3.1 is violated (and it is easily
seen that $M(\epsilon)^{-1}$ does not exist), so this is not allowed under the
present assumptions. When there are n linearly independent binding
constraint gradients, then we have a special instance of the third
circumstance which follows.

Case 3.  There are n linearly independent
binding constraint gradients.

The n x n Jacobian $P$ of the n constraints with respect to $x$
must be nonsingular. Hence,

$$\begin{aligned}
A_{11} &= 0 , \\
A_{12} &= A_{21}^T = P^{-1} , \\
A_{22} &= -P^{-T}\, \nabla_x^2 L \, P^{-1} .
\end{aligned} \tag{5.9}$$

Note that if $(\nabla_x^2 L)^{-1}$ exists, (5.9) follows from (5.8); however,
here the existence of $(\nabla_x^2 L)^{-1}$ is not assumed. Also, the
remaining possibility mentioned in Case 2 above gives $A_{11} = A_{22} = 0$
and $A_{12} = A_{21}^T = P^{-1}$. (It may be of interest to note that this
last situation characterizes conditions that hold at a nondegenerate solution
of a linear programming problem, with n linearly independent binding
constraint gradients and $\nabla_x^2 L = 0$.)


Case 4.  $r + p < n$ and $\nabla_x^2 L \ne 0$.

This is the least structured, and hence most general, situation that
can be encountered under the given assumptions. Many representations of
$M^{-1}$ are possible, depending on how the data are organized. However, a
general representation that is tailored to the assumptions we are making
here was obtained by McCormick [20] and will serve our purpose extremely
aptly. We follow his development here rather closely.

First, note that assumptions (ii) and (iii) imply that, at
$(x^*,u^*,w^*)$ with $\epsilon = 0$, we must have $Z^T \nabla_x^2 L Z > 0$ for
all $Z \ne 0$ such that $PZ = 0$, where
$P = (-\nabla_x \bar g^T, \nabla_x h^T)^T$, the (r+p) by n matrix defined in
(5.5). Hence, if $S$ is any n by [n-(r+p)] matrix that generates the
null space of $P$, then $Z = Sy$ for some $y$ in $E^{n-(r+p)}$ implies that
$PZ = 0$ (since $PS = 0$) and also that $Z \ne 0$ if $y \ne 0$ (since we are
tacitly assuming that $S$ has full column rank [n-(r+p)]) and hence,
$Z^T \nabla_x^2 L Z = y^T S^T \nabla_x^2 L S y > 0$, providing that
$y \ne 0$. We conclude that $D = S^T \nabla_x^2 L S$ is a positive definite
[n-(r+p)] by [n-(r+p)] matrix. Further, since $P$ has rank r+p by
Assumption (iii), an n by (r+p) pseudo-inverse $P^\#$ of $P$ exists, i.e., a
rank (r+p) matrix satisfying $P P^\# P = P$. These constructs and observations
lead to a general representation of the block components of $M^{-1}$.

As indicated in [20], $P^\#$ and $S$ are assumed in practice to be
generated by some matrix technique which relates these quantities by the
expression $I - P^\#P = SW$, where $W$ is some $[n-(r+p)]$ by $n$ matrix.
Also, since $PP^\# = I$ and (3.1) gives
$\nabla_x f(x^*,0)^T = -P^T[(u^*)^T,(w^*)^T]^T$ at optimality, it may be of
interest to note that

$$
\begin{pmatrix} u^* \\ w^* \end{pmatrix} = -(P^\#)^T \nabla_x f(x^*,0)^T .
$$

This motivates the widely used estimate $-(P^\#)^T \nabla_x f^T$ for the
Lagrange multipliers in algorithms involving these constructs.
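The multiplier estimate can be illustrated on a tiny problem. The following is a minimal sketch, not taken from the paper: the problem (minimize $x_1 + x_2$ subject to $x_1 x_2 - 1 = 0$, $x > 0$), the sign convention $L = f + wh$, and the choice $P^\# = P^T(PP^T)^{-1}$ are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical example: minimize x1 + x2 subject to h(x) = x1*x2 - 1 = 0, x > 0.
# Its solution is x* = (1, 1), with multiplier w* = -1 for L = f + w*h.
x_star = np.array([1.0, 1.0])
grad_f = np.array([1.0, 1.0])              # gradient of f at x*
P = np.array([[x_star[1], x_star[0]]])     # gradient of the binding constraint (1 by 2)

# One admissible pseudo-inverse (assumed choice): the right inverse P^T (P P^T)^{-1}.
P_sharp = P.T @ np.linalg.inv(P @ P.T)

# Multiplier estimate -(P^#)^T grad_f^T.
w_est = -P_sharp.T @ grad_f
print(w_est)
```

At the exact solution the estimate reproduces the multiplier; away from the solution it is only an approximation whose quality depends on how close the current point is to $x^*$.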


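The result (equivalent to that given in [20]), in terms of the block
components $A_{ij}$ of $M^{-1}$, follows readily and is given by

$$
\begin{aligned}
A_{11} &= S D^{-1} S^T , \\
A_{12} &= A_{21}^T = [I - A_{11} \nabla_x^2 L] P^\# , \\
A_{22} &= -A_{21} \nabla_x^2 L P^\# .
\end{aligned}
\tag{5.10}
$$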
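The block formulas (5.10) can be checked numerically. The sketch below is illustrative only and rests on stated assumptions: $M$ is taken to have the symmetric form $[[\nabla_x^2 L, P^T],[P, 0]]$ (the paper's definition (5.5) may differ in sign conventions), the data are random, a positive definite matrix `H` stands in for $\nabla_x^2 L$, `S` comes from an SVD, and `P_sharp` is the Moore-Penrose inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2                                   # k plays the role of r + p

P = rng.standard_normal((k, n))               # binding gradients, full row rank (generic)
B = rng.standard_normal((n, n))
H = B @ B.T + np.eye(n)                       # stand-in for the Hessian of L

# S: orthonormal basis of the null space of P, read off the SVD.
_, _, Vt = np.linalg.svd(P)
S = Vt[k:].T                                  # n by (n - k), with P @ S = 0
P_sharp = np.linalg.pinv(P)                   # right inverse, since P has full row rank

# Block formulas (5.10).
D = S.T @ H @ S
A11 = S @ np.linalg.inv(D) @ S.T
A12 = (np.eye(n) - A11 @ H) @ P_sharp
A21 = A12.T
A22 = -A21 @ H @ P_sharp

M = np.block([[H, P.T], [P, np.zeros((k, k))]])   # assumed form of M
M_inv = np.block([[A11, A12], [A21, A22]])
print(np.allclose(M @ M_inv, np.eye(n + k)))
```

Positive definiteness of `H` on all of $E^n$ is stronger than needed; the derivation only requires $D = S^T \nabla_x^2 L S$ to be positive definite.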


There  are  many  good  techniques  currently  available  for  calculating 
$S$ and $P^\#$, motivated by various numerical efficiency, stability and
algorithmic  considerations  [20].  We  mention  two  here  for  completeness  and 
because  they  are  precisely  tailored  to  the  calculations  associated  with  two 
important  families  of  mathematical  programming  algorithms,  reduced  gradient 
and  projected  gradient  type  algorithms. 

The first technique is associated with the reduced gradient or
variable reduction type algorithms for nonlinear programming, and is a
crucial part of the simplex method for linear programming. It is based on
the simple observation that the linear independence assumption implies the
existence of an $(r+p)$ by $(r+p)$ nonsingular submatrix $P_D$ of $P$.
Assuming for simplicity that the first $r+p$ columns of $P$ are linearly
independent, we can partition $P$ as $P = (P_D, P_I)$, where $P_I$ is an
$(r+p)$ by $[n-(r+p)]$ matrix. This induces a natural decomposition of the
variable $x = (x_D^T, x_I^T)^T$, and since $g(x^*,0) = 0$ and
$h(x^*,0) = 0$, allows application of the implicit function theorem to
conclude that there exists a twice differentiable vector function
$x_D(x_I)$ such that $g[x_D(x_I), x_I, 0] = 0$ and
$h[x_D(x_I), x_I, 0] = 0$ near $x_I = x_I^*$, and
$(x_D(x_I^*), x_I^*) = x^*$. The $x_D$ and $x_I$ may be thought of as
"dependent" and "independent" variables, respectively; hence the choice
of indices. Once the binding constraints are identified, it suffices to
minimize $f[x_D(x_I), x_I, 0]$ over $x_I$ using any appropriate
unconstrained method, to determine $x_I^*$, and hence $x^*$. The indicated
algorithms actually invoke


the  linear  independence  assumption  for  all  feasible  boundary  points,  and 
hence  at  any  given  iteration  can  either  reduce  f(x)  without  encountering 
constraints,  or  will  be  in  a situation  completely  analogous  to  the  one 
described  at  the  outset,  and  can  proceed  to  minimize  f over  the  currently 
independent  variables  in  the  space  of  currently  binding  constraints. 


Returning to the determination of $S$ and $P^\#$ for this type of
algorithm, we observe that $S = (S_D^T, S_I^T)^T$ must satisfy
$PS = (P_D, P_I)(S_D^T, S_I^T)^T = 0$, so $S_D = -P_D^{-1} P_I S_I$ and hence

$$
S = \begin{pmatrix} -P_D^{-1} P_I \\ I \end{pmatrix} S_I .
$$

Similarly, since $PP^\#P = P$, defining
$P^\# \equiv ((P_1^\#)^T, (P_2^\#)^T)^T$ gives the result

$$
P_1^\# = P_D^{-1} - P_D^{-1} P_I P_2^\# .
$$

In terms of the quantities defined, the block components $A_{ij}$
of $M^{-1}$ are therefore given by (5.10), where

$$
S = T S_I , \qquad
P^\# = \begin{pmatrix} P_D^{-1} \\ 0 \end{pmatrix} + T P_2^\# , \qquad
\text{and} \qquad
T = \begin{pmatrix} -P_D^{-1} P_I \\ I \end{pmatrix} ,
$$

$T$ being an $n$ by $[n-(r+p)]$ matrix with rank $[n-(r+p)]$; $S_I$ is any
$[n-(r+p)]$ square nonsingular matrix, and $P_2^\#$ is any $[n-(r+p)]$
by $(r+p)$ matrix.

For the projected gradient type algorithms, the gradients of
binding constraints are again assumed linearly independent at feasible
boundary points, with the data being organized mainly to accommodate a
projection matrix of the form $P_R = I - P^T(PP^T)^{-1}P$, used to project a
given direction vector into the linear subspace associated with the
currently binding constraints. Here, the rows of $P$ would represent the
gradients of the constraints currently deemed to remain binding in the
next iteration, and $PP^T$ is an $(r+p)$ by $(r+p)$ matrix, nonsingular under
the linear independence assumption. We find that $P^\# = P^T(PP^T)^{-1}$
satisfies the requirements for a pseudo-inverse of $P$, and it follows easily
that a suitable choice for $S$ is any matrix $S_R$ formed by selecting any
$[n-(r+p)]$ linearly independent columns of $P_R$. The block components of
$M^{-1}$ for gradient projection type calculations are therefore the same
as (5.10), with $S = S_R$ and $P^\# = P^T(PP^T)^{-1}$.

Therefore, returning to the calculation of the derivatives of the
Kuhn-Tucker triple, we evaluate (5.6) for the representation (5.7) of
$M^{-1}$ and the expression (5.4) of $N$ to obtain

$$
\nabla_\epsilon y(\epsilon) =
\begin{pmatrix}
\nabla_\epsilon x(\epsilon) \\ \nabla_\epsilon u(\epsilon) \\ \nabla_\epsilon w(\epsilon)
\end{pmatrix}
= M(\epsilon)^{-1} N(\epsilon) =
\begin{pmatrix}
-A_{11} \dfrac{\partial(\nabla_x L^T)}{\partial\epsilon}
 + A_{12} \begin{pmatrix} \partial g/\partial\epsilon \\ -\partial h/\partial\epsilon \end{pmatrix} \\[2ex]
-A_{21} \dfrac{\partial(\nabla_x L^T)}{\partial\epsilon}
 + A_{22} \begin{pmatrix} \partial g/\partial\epsilon \\ -\partial h/\partial\epsilon \end{pmatrix}
\end{pmatrix} ,
\tag{5.11}
$$

the $A_{ij}$ being given by (5.8), (5.9) or (5.10), depending on the respective
conditions that apply and depending on how the data are organized.
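As an illustration of the remark that the $A_{ij}$ depend only on how the data are organized, the following sketch builds $S$ and $P^\#$ in the two ways described for reduced gradient and projected gradient calculations and checks that both reproduce the same $M^{-1}$ through (5.10). The numerical data, the particular choices of $S_I$ and $P_2^\#$, and the symmetric form assumed for $M$ are all hypothetical.

```python
import numpy as np

n, k = 5, 2                                       # k plays the role of r + p
P = np.array([[2.0, 1.0, 1.0, 2.0, 0.0],
              [1.0, 3.0, -1.0, 1.0, 3.0]])        # binding gradients, rank 2
H = np.diag([2.0, 3.0, 4.0, 5.0, 6.0]) + 0.5      # stand-in for the Hessian of L

def inverse_blocks(S, P_sharp):
    """Assemble M^{-1} from the block formulas (5.10)."""
    A11 = S @ np.linalg.solve(S.T @ H @ S, S.T)
    A12 = (np.eye(n) - A11 @ H) @ P_sharp
    A22 = -A12.T @ H @ P_sharp
    return np.block([[A11, A12], [A12.T, A22]])

# Reduced gradient organization: P = (P_D, P_I), first k columns independent.
P_D, P_I = P[:, :k], P[:, k:]
T = np.vstack([-np.linalg.solve(P_D, P_I), np.eye(n - k)])
S_I = np.diag([1.0, 2.0, 3.0]) + np.triu(np.ones((n - k, n - k)), 1)  # nonsingular
P2 = 0.5 * np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])            # arbitrary block
S_red = T @ S_I
P_sharp_red = np.vstack([np.linalg.inv(P_D), np.zeros((n - k, k))]) + T @ P2

# Projected gradient organization.
P_sharp_proj = P.T @ np.linalg.inv(P @ P.T)
P_R = np.eye(n) - P_sharp_proj @ P                # projection onto the null space of P
S_proj = P_R[:, : n - k]                          # n - k columns, independent for this P

M = np.block([[H, P.T], [P, np.zeros((k, k))]])   # assumed form of M
M_inv_red = inverse_blocks(S_red, P_sharp_red)
M_inv_proj = inverse_blocks(S_proj, P_sharp_proj)
print(np.allclose(M_inv_red, M_inv_proj))
```

The agreement reflects the fact that $M^{-1}$ is unique: changing the basis $S$ of the null space of $P$, or the right inverse $P^\#$, changes the intermediate quantities but not the assembled blocks.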

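The Hessian $\nabla_\epsilon^2 f^*(\epsilon)$ (4.5) of the optimal value
function may also be readily calculated once (5.11) has been evaluated. To
do this efficiently, first note that we may rewrite (4.5) as

$$
\nabla_\epsilon^2 f^*(\epsilon) = \partial^2 L/\partial\epsilon^2
 + \left[ (\partial(\nabla_x L^T)/\partial\epsilon)^T ,\; -(\partial g/\partial\epsilon)^T ,\; (\partial h/\partial\epsilon)^T \right] \nabla_\epsilon y(\epsilon) .
\tag{5.12}
$$

Denoting by $\nabla_\epsilon^2 f^*$ the "reduced Hessian" that results from
eliminating terms associated with nonbinding constraints, and using the
previous notation, we obtain the concise expression

$$
\nabla_\epsilon^2 f^*(\epsilon) = \partial^2 L/\partial\epsilon^2 - N^T \nabla_\epsilon y(\epsilon) ,
\tag{5.13}
$$

or equivalently,

$$
\nabla_\epsilon^2 f^*(\epsilon) = \partial^2 L/\partial\epsilon^2 - N^T M^{-1} N .
\tag{5.14}
$$

The Hessian can now be calculated from the given problem data and (5.13)
or (5.14), using (5.11), evaluating the $A_{ij}$ as given in (5.8), (5.9) or
(5.10), depending on which conditions apply.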

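For Problem $R(\epsilon)$, (5.11) simplifies considerably, as shown by
Armacost and Fiacco [5]. It is easy to verify that for this problem,
$\partial(\nabla_x L^T)/\partial\epsilon = 0$,
$\partial g/\partial\epsilon = [-I, 0]$ and
$\partial h/\partial\epsilon = [0, -I]$. Therefore, (5.11) becomes

$$
\nabla_\epsilon y(\epsilon) =
\begin{pmatrix}
\nabla_\epsilon x(\epsilon) \\ \nabla_\epsilon u(\epsilon) \\ \nabla_\epsilon w(\epsilon)
\end{pmatrix}
=
\begin{pmatrix} A_{12} \\ A_{22} \end{pmatrix}
\begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} .
\tag{5.15}
$$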


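The general formulas for $\nabla_\epsilon^2 f^*(\epsilon)$ also simplify for
Problem $R(\epsilon)$. Observe that for this problem,
$\partial^2 L/\partial\epsilon^2 = 0$ and

$$
N^T = \left[\, 0 ,\; \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} \right] .
$$

Hence, using the form (5.13) we obtain

$$
\nabla_\epsilon^2 f^*(\epsilon) =
-\left[\, 0 ,\; \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} \right]
\begin{pmatrix} \nabla_\epsilon x \\ \nabla_\epsilon u \\ \nabla_\epsilon w \end{pmatrix}
=
\begin{pmatrix} \nabla_\epsilon u \\ -\nabla_\epsilon w \end{pmatrix} ,
$$

which essentially agrees with the result obtained in Corollary 4.1, and
using the form (5.14) we obtain the interesting result

$$
\nabla_\epsilon^2 f^* =
-\begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix}
A_{22}
\begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} .
\tag{5.16}
$$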
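A small numerical check of (5.16) is sketched below for the special case of equality constraints only, where the sign matrix reduces to the identity and the formula reads $\nabla_\epsilon^2 f^* = -A_{22}$. The quadratic program, its data, the convention $L = f + w^T h$, and the symmetric form assumed for $M$ are hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical instance of R(eps) with equality constraints only:
#   minimize 0.5 x^T Q x + c^T x   subject to   A x - b - eps = 0.
Q = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
c = np.array([1.0, -2.0, 0.5])
A = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])
b = np.array([1.0, 0.0])
n, p = 3, 2

M = np.block([[Q, A.T], [A, np.zeros((p, p))]])   # assumed Kuhn-Tucker Jacobian

def f_star(eps):
    """Optimal value of the perturbed problem, from its Kuhn-Tucker system."""
    y = np.linalg.solve(M, np.concatenate([-c, b + eps]))
    x = y[:n]
    return 0.5 * x @ Q @ x + c @ x

A22 = np.linalg.inv(M)[n:, n:]
hess_formula = -A22                                # (5.16) with no inequalities

# Central second differences; exact up to rounding because f* is quadratic here.
d = 1e-2
E = np.eye(p)
hess_fd = np.zeros((p, p))
for i in range(p):
    for j in range(p):
        hess_fd[i, j] = (
            f_star(d * (E[i] + E[j])) - f_star(d * (E[i] - E[j]))
            - f_star(d * (E[j] - E[i])) + f_star(-d * (E[i] + E[j]))
        ) / (4 * d * d)

print(np.allclose(hess_formula, hess_fd, atol=1e-6))
```

Because `Q` is positive definite in this instance, the computed Hessian is also positive definite, so the optimal value is convex in the right-hand side perturbation.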


Aside from the considerable computational simplification compared
to the general problem, these results provide additional insights into the
structure of the solution, since we have explicit relationships for the
various parameter-derivatives in terms of quantities associated with the
original problem functions. For example, noting the result (5.16) and
the various possibilities for $A_{22}$ given in (5.8), (5.9) or (5.10), we
can see directly that $f^*(\epsilon)$ (associated with $R(\epsilon)$) is
convex in a neighborhood of $\epsilon = 0$ if the Lagrangian $L(y,\epsilon)$
of $R(\epsilon)$ is convex in $x$. This well known fact and several related
and less well known inferences associated with Problem $R(\epsilon)$ can be
shown explicitly using the given formulas. For additional details and
results, see the paper by Armacost and Fiacco [5].


6.  Discussion  of  Results  and  Extensions 

The nontrivial computational considerations associated with checking
whether all the conditions are satisfied, as required by the assumptions
of Theorem 3.1 and further compounded by the refinements associated with
the appropriate calculation of the $A_{ij}$, are typical of analogous
verification problems confronted by most numerical procedures. Additional
difficulties are, of course, associated with the (typical) requirement to
essentially know the solution before the conditions required to solve the
problem can be verified or the solution analyzed. Such concerns are outside
the scope of the presentation, which is primarily concerned with the
existence and characterization of relationships that hold at a solution.
However, a few relevant comments can be offered.


As stated briefly at the outset, a method for estimating solution
sensitivity information by using penalty function methods was established
by Fiacco [15], implemented on the computer by Armacost and Mylander [8]
and extended and applied by Armacost and Fiacco [3] - [6]. This
approach is based on the fact that the local solution matrix of first
partial derivatives $\nabla_\epsilon x(\epsilon)$, the optimal value
$f^*(\epsilon)$ and the gradient $\nabla_\epsilon f^*(\epsilon)$ and Hessian
$\nabla_\epsilon^2 f^*(\epsilon)$ of the optimal value function, are component
by component limits of the parameter-derivatives of the penalty function
minimizing point, optimal penalty function (parameter) gradient and Hessian,
respectively, under the given conditions. In effect, a class of algorithms
was shown to generate a trajectory that both terminates at a solution and
rather faithfully reflects the perturbation behavior (subject to parameter
changes) of the solution, as the solution is approached. Furthermore, the
calculations required to determine the sensitivity information turn out to
be of the same form as the calculations required by the algorithm to
generate a solution trajectory. Thus, for such algorithms applied to problems
satisfying the appropriate conditions, increasingly accurate estimates of
the sensitivity information are available with little extra effort, as the
solution is approached, i.e., the solution need not be known in advance
of easily determining certain aspects of its behavior. (Another example
of this sort of result is the calculation of error bounds in solving
systems of equations. For an application to nonlinear programming, see the
paper by Robinson [22].)

A second theoretical advantage of the penalty function approach to
estimating solution sensitivity involves the calculation of the $A_{ij}$
defining $M^{-1}$ (5.7) and is worth noting. Under the assumed conditions,
the Hessian of the penalty function is positive definite near a solution.
(See Fiacco and McCormick [16, Theorem 12] and Fiacco [15].) The
stationarity condition of the penalty function at the minimizing point
essentially approximates (with appropriate interpretation) the information
given in (3.1), and the result is a single formula (obtained by Fiacco [15])
for the approximation of $\nabla_\epsilon y(\epsilon)$. Thus, there are no
alternative calculations such as (5.8), (5.9), or (5.10) that depend on the
status of the solution. (Armacost and Fiacco [4] provide a detailed
treatment of the penalty function estimates.)

The latter advantage has been shown to extend to augmented Lagrangian
functions by Armacost and Fiacco [7], Armacost [2], and Buys and Gonin [13].
Indeed, it is clear that unique formulas for $\nabla_\epsilon y(\epsilon)$
will obtain for that family of generalized Lagrangians and exact penalty
functions that are structured such that their Hessians are positive definite
at a Kuhn-Tucker triple under the conditions of Theorem 3.1. A large class
of such Lagrangians was developed by Arrow, Gould and Howe [9]. Essentially,
if the extended Lagrangian is denoted by $\phi$ then, since the role of
$\phi$ is precisely analogous to the role of the usual Lagrangian $L$, and
since $\nabla_x \phi = 0$ and $\nabla_x^2 \phi$ is positive definite at a
Kuhn-Tucker triple, it follows that $\phi$ can replace $L$ in the results
given. In particular, it follows that the $A_{ij}$ that determine $M^{-1}$
are uniquely given by (5.8), with $\phi$ replacing $L$ in those formulas.

It is also clear that, if the Lagrange multipliers (of such an
extended Lagrangian) are sufficiently smooth functions of the problem
parameters that converge to (locally) optimal multipliers, then the
associated minimizing point of the Lagrangian function, along with the
parameter-derivatives of the minimizing point, will converge respectively
to $x(\epsilon)$ and $\nabla_\epsilon x(\epsilon)$, and the optimal value,
gradient and Hessian of the Lagrangian will converge to $f^*(\epsilon)$,
$\nabla_\epsilon f^*(\epsilon)$ and $\nabla_\epsilon^2 f^*(\epsilon)$,
respectively. Thus, these functions also give rise to techniques for
estimating sensitivity information prior to the determination of a solution,
analogous to those obtained for penalty functions. An extension of the class
of algorithms that can be so utilized should continue to be an interesting
subject of future research.

In  the  other  direction,  that  of  using  solution  sensitivity  information 
to  characterize  algorithmic  behavior,  interesting  examples  are  the  proof 
by  Meyer  [21]  of  convergence  of  a family  of  algorithms  and  the  determination 
by  Robinson  [22]  of  the  convergence  and  rate  of  convergence  of  a large 
class  of  algorithms. 

Finally,  though  we  have  concentrated  on  sensitivity  analysis  and 
developed  neighborhood  results,  some  of  these  results  may  be  expected  to 
extend to parametric nonlinear programming, where the parameters are
permitted to range in a prescribed set. A characterization and sensitivity
and  stability  analysis  of  parameter-dependent  solutions  will  undoubtedly 
be  a subject  of  sustained  future  investigation.  It  seems  apparent  that 
results  "in  the  large"  will  depend  critically  on  neighborhood  results  such 
as  those  presented  here. 

An  immediate  application  of  the  sensitivity  analysis  results 
obtained  here  is  a calculation  of  first  order  estimates  of  a Kuhn-Tucker 
triple  of  a problem  with  parameter  changes,  and  first  and  second  order 
estimates  of  the  optimal  value  function,  using  Taylor's  series  expansions. 

If $x(0)$ is a solution of Problem $P(0)$ satisfying the conditions of
Theorem 3.1, then a first order estimate of the optimal value function,
$f^*(\epsilon) \equiv f[x(\epsilon),\epsilon]$, for $\epsilon$ in a
neighborhood of 0, is given by

$$
f^*(\epsilon) \approx f^*(0) + \nabla_\epsilon f^*(0)\epsilon ,
\tag{6.1}
$$

and a second order estimate is given by

$$
f^*(\epsilon) \approx f^*(0) + \nabla_\epsilon f^*(0)\epsilon
 + \tfrac{1}{2} \epsilon^T \nabla_\epsilon^2 f^*(0) \epsilon ,
\tag{6.2}
$$

where $f^*(0) = f(x^*,0)$ and $\nabla_\epsilon f^*(\epsilon)$ and
$\nabla_\epsilon^2 f^*(\epsilon)$ are defined by (4.4) and (4.5),
respectively. A first order estimate of the Kuhn-Tucker triple $y(\epsilon)$
is given by

$$
y(\epsilon) \approx y(0) + \nabla_\epsilon y(0)\epsilon
 = \begin{pmatrix} x^* \\ u^* \\ w^* \end{pmatrix} + M^{-1}(0) N(0) \epsilon ,
\tag{6.3}
$$

where we have used Conclusion (1) of Theorem 3.1 and Equation (3.3).
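The estimates (6.1) - (6.3) can be tried on a problem with a known closed-form solution. The sketch below is illustrative and not from the paper: the problem (minimize $x_1 + x_2$ subject to $x_1 x_2 - 1 - \epsilon = 0$, $x > 0$), the convention $L = f + wh$, and the way $M$ and $N$ are assembled (Jacobian of the Kuhn-Tucker system in $(x, w)$, and minus its $\epsilon$-derivative) are assumptions.

```python
import numpy as np

# Hypothetical problem P(eps): minimize x1 + x2 s.t. x1*x2 - 1 - eps = 0, x > 0.
# Closed form: x(eps) = (s, s), w(eps) = -1/s, f*(eps) = 2*s, with s = sqrt(1 + eps).
def exact(eps):
    s = np.sqrt(1.0 + eps)
    return np.array([s, s, -1.0 / s]), 2.0 * s

y0, f0 = exact(0.0)                       # Kuhn-Tucker pair (x*, w*) and f*(0)
x1, x2, w = y0

# Jacobian M of the system [grad_x L; h] = 0 with respect to (x, w), at eps = 0.
M = np.array([[0.0, w, x2],
              [w, 0.0, x1],
              [x2, x1, 0.0]])
N = np.array([0.0, 0.0, 1.0])             # minus the eps-derivative of the system

dy = np.linalg.solve(M, N)                # first partials of (x, w) at eps = 0
grad_f = -w                               # dL/d(eps) evaluated at the solution
hess_f = -N @ dy                          # (5.14); d^2 L / d(eps)^2 = 0 here

eps = 0.1
y_est = y0 + dy * eps                     # first order estimate, as in (6.3)
f_est1 = f0 + grad_f * eps                # first order estimate, as in (6.1)
f_est2 = f_est1 + 0.5 * hess_f * eps**2   # second order estimate, as in (6.2)
y_true, f_true = exact(eps)

print("triple   :", y_est, y_true)
print("value    :", f_est1, f_est2, f_true)
```

In this instance the second order estimate of the optimal value is noticeably closer to the exact value than the first order one, as expected for a small perturbation.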


An  ever  important  general  application  of  sensitivity  analysis  is  the 
determination  of  those  parameters  to  which  a solution  is  the  most  sensitive. 

In  the  context  of  mathematical  programming,  if  the  optimal  value  or  one  or 
more  components  of  a solution  vector  or  any  of  the  constraints  can  change 
erratically  for  small  changes  in  a parameter,  there  is  little  comfort  in 
having a particular solution at hand for the given data, if the data are (as
usual)  subject  to  errors  or  alterations  that  can  exceed  these  "small  changes." 
A sensitivity  analysis  can  thus  lead  to  the  more  likely  sources  of  instability 
in  the  model  and  to  a further  study  of  data  inaccuracy  (e.g.,  suggesting 
more  observations  to  reduce  the  variance  of  sample  estimates,  as  in  a chance 
constrained  formulation  of  a problem  studied  by  Armacost  and  Fiacco  [3]).  It 
can  also  suggest  reformulating  the  model  to  eliminate  various  instabilities 
(e.g., by refraining from expressing an equality constraint as two
inequalities, the consequences of which are easily seen to make singular the
Jacobian $M$ of the Kuhn-Tucker system (3.1), the computational implications
being dramatically conveyed by Robinson [23]) or introducing
"regularizations," i.e., estimates or perturbations that introduce
stabilities (e.g., replacing nondifferentiable functions by differentiable
approximations or perturbing a function so that the Hessian of the perturbed
function is nonsingular in an appropriate domain, as in the various
definitions of augmented Lagrangians).
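The instability caused by writing an equality as two inequalities can be seen in a few lines. This is a minimal sketch under stated assumptions: the identity stands in for the Hessian of the Lagrangian, the constraint data are hypothetical, and the reduced matrix $[[H, P^T],[P, 0]]$ over the binding constraints is used in place of the full Jacobian of (3.1).

```python
import numpy as np

# Hypothetical data: one linear constraint a^T x = 1 in three variables,
# with the identity standing in for the Hessian of the Lagrangian.
H = np.eye(3)
a = np.array([[1.0, 2.0, -1.0]])

def kt_matrix(P):
    """Assumed reduced Jacobian [[H, P^T], [P, 0]] for binding gradients P."""
    k = P.shape[0]
    return np.block([[H, P.T], [P, np.zeros((k, k))]])

M_eq = kt_matrix(a)                       # constraint kept as one equality
M_split = kt_matrix(np.vstack([a, -a]))   # same constraint as two binding inequalities

rank_eq = np.linalg.matrix_rank(M_eq)
rank_split = np.linalg.matrix_rank(M_split)
print(rank_eq, M_eq.shape[0])
print(rank_split, M_split.shape[0])
```

The split formulation yields two linearly dependent binding gradients, so the matrix loses rank and the derivative formulas based on its inverse no longer apply.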

The sensitivity information for the optimal value function and the
Kuhn-Tucker triple can also be used to drive various "cyclic" procedures
for solving problems involving optimization, e.g., in solving
$\min_x \max_y F(x,y)$ by cycling between descent moves in x-space and ascent
moves in y-space, where the parameter $\epsilon$ of $P(\epsilon)$ would
essentially momentarily correspond to that subset of variables that are
considered to be "independent" for a given iteration. An excellent
discussion of this sort of method may be found in a paper by Hogan [17], and
a recent application using the penalty function approximations mentioned
earlier and validated in [15] was given by de Silva [14]. The latter involves
the solution of an implicitly defined optimization model of U.S. crude oil
production.

For Problem $R(\epsilon)$, where the parameters are the right-hand side of
the constraints, the Kuhn-Tucker triple derivatives (5.15) and the Hessian
of the optimal value function (5.16) are relatively easy calculations and
should have powerful application in solving large-scale problems by
introducing Newton-type techniques in the various established decomposition
procedures. Problems of this type are also intimately involved in much of
duality theory, and sensitivity information may have useful application in
defining and accelerating algorithms for solving $R(0)$ by various dual
methods. Sensitivity results for Problem $R(\epsilon)$ are treated in
considerable detail by Armacost and Fiacco [5]. Potential applications are
abundant.

We have presented a number of basic results for a locally rather
ideally behaved class of nonlinear programming problems. Results involving
the general behavior of the optimal value function and a given solution or
solution set, under less stringent assumptions, have been known for some
time, and numerous significant refinements, extensions, and generalizations
have been obtained only recently. The subject of sensitivity and stability
analysis in nonlinear programming is finally receiving the attention it
deserves. The reader interested in pursuing the subject further may make an
excellent start by studying the articles [17], [18], and [19] by Hogan and
consulting the numerous references therein.